IR with and without GA: Study the Effectiveness of the Developed Fitness Function on the Two Suggested Approaches

Authors

  • Ammar Al-Dallal
  • Rasha S. Abdul-Wahab
  • Ramzi El-Haddadeh
Abstract

This paper proposes two information retrieval (IR) approaches. The first, IR with GA, is a genetic algorithm (GA)-based approach that introduces modified GA operators to achieve high performance. The second, IR without GA, is based on the traditional IR approach. Both enhance the precision and recall of web search by improving the document representation, for which an enhanced inverted index is developed. Moreover, both models use the same proposed evaluation function to measure the relevance of a document to the user query. The two suggested approaches were compared experimentally, in terms of retrieval quality, with two classical IR techniques, namely the Okapi-BM25 fitness function and the Bayesian inference network model. The results demonstrate a good level of enhancement in recall and precision. In addition, the documents retrieved by IR with and without GA are more accurate and more relevant to the queries than those retrieved by the other techniques. Overall, the two suggested approaches provide a promising technique in the web search domain, delivering high-quality search results in terms of recall and precision.

DOI: 10.4018/jamc.2013010101

International Journal of Applied Metaheuristic Computing, 4(1), 1-20, January-March 2013. Copyright © 2013, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

[...] (GA). The GA is biologically inspired and has many mechanisms derived from natural evolution. Because of its parallel mechanism in high-dimensional space, GA has been used to solve many optimization problems, which in turn has encouraged researchers to use this algorithm in IR. Besides, GA is an important approach for providing information that suits the user's needs.
IR and GA are integrated to spare web users two specific problems when trying to retrieve useful information: many of the retrieved documents are not related to the user query, which is called low precision, and many relevant documents are not retrieved, which is called low recall. This work aims to provide a better solution to these problems by enhancing these two measures, namely precision and recall.

This paper proposes two IR approaches. The first is IR with GA, a GA-based IR approach that introduces modified GA operators, which allow it to achieve high performance. The second is IR without GA, which is based on the traditional IR approach. Both enhance the precision and recall of web search by improving the document representation, for which an enhanced inverted index is developed. Moreover, both models use the same proposed evaluation function to measure the relevance of a document to the user query.

In addition, the two suggested approaches follow the same document indexing process, called the enhanced inverted index. For each term in the indexing model, the index stores its position from the beginning of the document and its position within the sentence, in addition to the list of all documents referencing it.

The rest of this paper is organized as follows: in section 1, we describe the problem statement and objectives. Section 2 discusses related work. Document representation and document indexing are introduced in section 3. Section 4 introduces the proposed approach implemented with GA. Section 5 presents the architecture of IR without GA. Section 6 presents the experimental results of the proposed method. Section 7 gives the conclusions of this study.

2. PROBLEM STATEMENT AND OBJECTIVES

Every internet user wishes to obtain satisfactory results when using any web search engine. Satisfaction here means that all the retrieved results are relevant and all relevant documents are retrieved; in other words, the web user is satisfied when the search engine retrieves all and only the relevant documents. Although several enhancements have been achieved in such searching techniques, web users still suffer from two major problems, low precision and low recall, when trying to retrieve useful information. Therefore, the main objective of this paper is to propose two IR approaches that enhance the precision and recall of the retrieved web documents.
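
To make the two ideas above concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "position within the sentence" means the word offset inside the sentence, that sentences end at `.`, `!` or `?`, and that terms are lowercased word tokens. The names `build_index` and `precision_recall` are hypothetical and chosen only for illustration. The first function builds an inverted index that keeps, for each term and document, the pair (position in document, position in sentence). The second computes precision and recall as defined in the text.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Build a positional inverted index (illustrative assumption, not the paper's code).

    Returns: term -> {doc_id: [(position_in_document, position_in_sentence), ...]}
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        doc_pos = 0  # word offset counted from the beginning of the document
        # Assumption: a sentence ends at '.', '!' or '?'.
        for sentence in re.split(r"[.!?]+", text):
            terms = re.findall(r"\w+", sentence.lower())
            for sent_pos, term in enumerate(terms):
                index[term][doc_id].append((doc_pos, sent_pos))
                doc_pos += 1
    return index


def precision_recall(retrieved, relevant):
    """Precision = hits / |retrieved|, recall = hits / |relevant|."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data, invented purely for this example.
docs = {
    "d1": "Genetic algorithms search large spaces. Retrieval uses an index.",
    "d2": "An inverted index maps terms to documents.",
}
index = build_index(docs)

# Documents that reference the term "index", with their stored positions.
for doc_id, positions in sorted(index["index"].items()):
    print(doc_id, positions)

# One relevant document out of three retrieved, one relevant document missed.
print(precision_recall(["d1", "d2", "d3"], ["d2", "d4"]))
```

In the toy data, the term "index" is the ninth word of `d1` (offset 8) and the fourth word of its sentence (offset 3), so the entry stored for `d1` is `(8, 3)`. A query that retrieves three documents of which only one is relevant, while missing one of the two relevant documents, has precision 1/3 and recall 1/2, which is the low-precision, low-recall situation the problem statement describes.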

Similar resources

An Efficient Genetic Agorithm for Solving the Multi-Mode Resource-Constrained Project Scheduling Problem Based on Random Key Representation

In this paper, a new genetic algorithm (GA) is presented for solving the multi-mode resource-constrained project scheduling problem (MRCPSP) with minimization of project makespan as the objective subject to resource and precedence constraints. A random key and the related mode list (ML) representation scheme are used as encoding schemes and the multi-mode serial schedule generation scheme (MSSG...

A Multi-Mode Resource-Constrained Optimization of Time-Cost Trade-off Problems in Project Scheduling Using a Genetic Algorithm

In this paper, we present a genetic algorithm (GA) for optimization of a multi-mode resource constrained time cost trade off (MRCTCT) problem. In the proposed GA, each activity has several operational modes and each mode identifies a possible execution time and cost of the activity. Beyond earlier studies on the time-cost trade-off problem, in the MRCTCT problem, resource requirements of each execution mo...

FEASIBILITY OF PSO-ANFIS-PSO AND GA-ANFIS-GA MODELS IN PREDICTION OF PEAK GROUND ACCELERATION

In the present study, two new hybrid approaches are proposed for predicting peak ground acceleration (PGA) parameter. The proposed approaches are based on the combinations of Adaptive Neuro-Fuzzy System (ANFIS) with Genetic Algorithm (GA), and with Particle Swarm Optimization (PSO). In these approaches, the PSO and GA algorithms are employed to enhance the accuracy of ANFIS model. To develop hy...

PMU Placement Methods in Power Systems based on Evolutionary Algorithms and GPS Receiver

In this paper, optimal placement of Phasor Measurement Unit (PMU) using Global Positioning System (GPS) is discussed. Ant Colony Optimization (ACO), Simulated Annealing (SA), Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) are used for this problem. Pheromone evaporation coefficient and the probability of moving from state x to state y by ant are introduced into the ACO. The modifi...

Scheduling on flexible flow shop with cost-related objective function considering outsourcing options

This study considers outsourcing decisions in a flexible flow shop scheduling problem, in which each job can be either processed by an in-house production line or outsourced. The selected objective function aims to minimize the weighted sum of tardiness costs, in-house production costs, and outsourcing costs with respect to the jobs' due dates. The purpose of the problem is to select the jobs tha...

Journal:
  • Int. J. of Applied Metaheuristic Computing

Volume 4, Issue 1

Pages 1-20

Publication date: 2013